Conversation
That was fast! Thanks!
# TODO: Use eos_id as ignore_id.
# tgt_key_padding_mask = decoder_padding_mask(ys_in_pad, ignore_id=eos_id)
It is commented out since the existing models were trained with it disabled; if it is enabled, the WER becomes worse. We should enable it when we start to train a new model.
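For context, here is a minimal sketch of what a decoder_padding_mask helper of this shape typically computes, assuming the usual convention that positions equal to ignore_id are padding to be masked; the project's actual helper may differ in details:

```python
import torch

def decoder_padding_mask(ys_in_pad: torch.Tensor, ignore_id: int = -1) -> torch.Tensor:
    """Return a bool mask that is True at padded positions.

    ys_in_pad: (batch, max_len) padded target token IDs.
    ignore_id: the token ID used for padding (the TODO above proposes eos_id).
    """
    return ys_in_pad == ignore_id
```

Note that if eos_id is used as ignore_id, genuine trailing EOS tokens would be masked as well, which may be part of why toggling this changes WER.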
The following is the WER from the model trained by #3 and decoded with this pull request. Epochs 14-26 are used in model averaging. I have uploaded the checkpoints to https://huggingface.co/csukuangfj/conformer_ctc/tree/main; the steps to reproduce the decoding result are given below.
The results are expected to become better if trained with more epochs.
Great!!
…On Tue, Aug 3, 2021 at 8:16 PM Fangjun Kuang wrote:
The following is the WER from the model trained by #3 and decoded with this pull request (with n-gram LM rescoring and attention decoder; the model is trained for 26 epochs).
For test-clean, WER of different settings are:
ngram_lm_scale_0.7_attention_scale_0.6 2.96 best for test-clean
ngram_lm_scale_0.9_attention_scale_0.5 2.96
ngram_lm_scale_0.7_attention_scale_0.5 2.97
ngram_lm_scale_0.7_attention_scale_0.7 2.97
ngram_lm_scale_0.9_attention_scale_0.6 2.97
ngram_lm_scale_0.9_attention_scale_0.7 2.97
ngram_lm_scale_0.9_attention_scale_0.9 2.97
ngram_lm_scale_1.0_attention_scale_0.7 2.97
ngram_lm_scale_1.0_attention_scale_0.9 2.97
ngram_lm_scale_1.0_attention_scale_1.0 2.97
ngram_lm_scale_1.0_attention_scale_1.1 2.97
ngram_lm_scale_1.0_attention_scale_1.2 2.97
ngram_lm_scale_1.0_attention_scale_1.3 2.97
ngram_lm_scale_1.1_attention_scale_0.9 2.97
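As a note on how these two scales combine (this is an assumption based on how snowfall-style rescoring usually works; the exact formula is not stated in this thread), each hypothesis is ranked by a weighted sum of scores, which is why the grid above sweeps both parameters:

```latex
\mathrm{tot\_score}
  = \mathrm{score}_{\mathrm{CTC}}
  + s_{\mathrm{ngram}} \cdot \mathrm{score}_{\mathrm{ngram}}
  + s_{\mathrm{attn}} \cdot \mathrm{score}_{\mathrm{attn}}
```

where s_ngram is ngram_lm_scale and s_attn is attention_scale.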
Epochs 14-26 are used in model averaging.
I have uploaded the above checkpoints to
https://huggingface.co/csukuangfj/conformer_ctc/tree/main
To reproduce the decoding result:
1. clone the above repo containing checkpoints and put it into
conformer_ctc/exp/
2. after step 1, you should have
conformer_ctc/exp/epoch-{14,15,...,26}.pt
3. run
./prepare.sh
./conformer_ctc/decode.py --epoch 26 --avg 13 --max-duration=50
4. You should get the above result.
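The --epoch 26 --avg 13 flags correspond to averaging the checkpoints from epochs 14-26. As a rough sketch of the idea (an element-wise mean of all parameter tensors; the project's real averaging helper may differ in details):

```python
import torch

def average_checkpoints(paths):
    # Hypothetical sketch: load each state_dict and take the
    # element-wise mean of every parameter tensor across checkpoints.
    avg = None
    for p in paths:
        state = torch.load(p, map_location="cpu")
        if avg is None:
            avg = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg[k] += v.float()
    for k in avg:
        avg[k] /= len(paths)
    return avg
```

Averaging over the last several epochs smooths out parameter noise from individual checkpoints, which typically gives a small WER improvement over decoding with a single epoch.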
Nice! I'm curious -- did you ever try to run the same thing but with MMI instead of CTC?
Yes, I am planning to do that with a pretrained P. All the related code can be found in snowfall.
Merging it to avoid conflicts.
* Fix an error in TDNN-LSTM training.
* WIP: Refactoring
* Refactor transformer.py
* Remove unused code.
* Minor fixes.
TODOs